Abstract

Background: The SH2B1 gene (Src-homology 2B adaptor protein 1 gene) is a strong candidate gene for obesity.
Large-scale genome-wide association studies (GWAS) identified markers in the vicinity of the gene, and animal
models suggest a potential relevance for human body weight regulation.

Methods: We performed a mutation screen for variants in the SH2B1 coding sequence in 95 extremely obese
children and adolescents. Detected variants were genotyped in independent childhood and adult study groups (up
to 11,406 obese or overweight individuals and 4,568 controls). Functional implications of the detected variants
for STAT3-mediated leptin signalling were analyzed in vitro.

Results: We identified two new rare mutations and five known SNPs (rs147094247, rs7498665, rs60604881,
rs62037368 and rs62037369) in SH2B1. Mutation g.9483C/T leads to a non-synonymous, non-conservative exchange
in the beta (βThr656Ile) and gamma (γPro674Ser) splice variants of SH2B1. It was additionally detected in two of
11,206 (extremely) obese or overweight children, adolescents and adults, but not in 4,506 population-based
normal-weight or lean controls. The non-coding mutation g.10182C/A at the 3' end of SH2B1 was only detected in three
obese individuals. For the non-synonymous SNP rs7498665 (Thr484Ala) we observed nominal over-transmission of
the previously described risk allele in 705 obesity trios (nominal p = 0.009, OR = 1.23) and an increased frequency of
the same allele in 359 cases compared to 429 controls (nominal p = 0.042, OR = 1.23). The obesity risk alleles at
Thr484Ala and βThr656Ile/γPro674Ser had no effect on STAT3-mediated leptin receptor signalling in splice variants
β and γ.

Conclusion: The rare coding mutation βThr656Ile/γPro674Ser (g.9483C/T) in SH2B1 was exclusively detected in
overweight or obese individuals. Functional analyses did not reveal impairments in leptin signalling for the mutated
SH2B1.

Background

A large-scale genome-wide association study (GWAS)
meta-analysis including a total of 249,796 individuals
of European ancestry confirmed 14 known obesity-susceptibility
loci and identified 18 new genetic loci
associated with body mass index (BMI). One of the
re-identified single nucleotide polymorphisms (SNPs) is
located near the Src-homology 2B adaptor protein 1
(SH2B1) gene (rs7359397) [1]. Association with obesity
was also shown for a coding SNP in SH2B1 (rs7498665:
g.8164A/G, Thr484Ala; [2,3]). Linkage disequilibrium
between rs7359397 and the coding SNP rs7498665 is
high (r² = 0.965, D' = 1; HapMap, http://hapmap.ncbi.nlm.nih.gov/).
Both SNPs are located within a large linkage
disequilibrium (LD) block. A region of more than
500 kb upstream and 150 kb downstream of SH2B1 is
flanked by recombination peaks of 37 cM/Mb and 36 cM/Mb,
respectively (SNP Annotation and Proxy Search,
SNAP; see Additional file 1: Figure S1). The association
of increased BMI with SH2B1 SNP (rs7359397 and
rs7498665) alleles has been robustly replicated, e.g. in (i)
4,923 Swedish adults [4], (ii) 12,462 individuals from
the German MONICA/KORA study [5], and (iii)
1,045 obese adults and 317 healthy lean individuals from
Belgium [6].
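
The two LD measures quoted above can be illustrated with a minimal sketch. The allele and haplotype frequencies below are hypothetical values chosen for illustration and are not HapMap data; the function name `ld_stats` is likewise an assumption of this sketch.

```python
def ld_stats(p_a: float, p_b: float, p_ab: float) -> tuple[float, float]:
    """Return (D', r^2) for two biallelic loci.

    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2.
    p_ab: frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical frequencies: allele A never occurs without allele B,
# but the two allele frequencies differ slightly.
d_prime, r2 = ld_stats(p_a=0.40, p_b=0.41, p_ab=0.40)
print(f"D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

With these made-up inputs D' equals 1 while r² stays slightly below 1, the same qualitative pattern as reported for rs7359397 and rs7498665: no evidence of historical recombination, but slightly different allele frequencies.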

A deletion of ~200 kb covering SH2B1 was recently
shown to be associated with severe early-onset obesity
[7], whereas the corresponding reciprocal duplication
was associated with leanness [8]. Additionally, a larger
interspersed deletion extending through a 593 kb region
on chromosome 16p11.2-p12.2 covering SH2B1 has
been associated with developmental delay, feeding difficulties,
dysmorphic facial features, and obesity [9,10].
Bochukova et al. [7] screened the coding region of
SH2B1 for causal mutations by re-sequencing 500
children with severe early-onset obesity from the Genetics of
Obesity Study (GOOS). The investigators detected SNP
rs7498665 (Thr484Ala [7]); rare variants were not identified.
Evidence in humans and from animal models suggests
that SH2B1 is a likely obesity gene. In humans, the
SH2B1 protein increases serum leptin levels and whole
body fat mass in females [11]. The influence of SH2B1
variants on the distribution of body fat and the amount
of visceral adipose tissue is still under discussion
[4,12,13]. With regard to animal models, Sh2b1 null
mice show a phenotype of obesity, hyperlipidemia, leptin
resistance, hyperphagia, hyperglycaemia, insulin resistance
and glucose intolerance [14]. This phenotype was
consistent when the knockout was regionally limited to
hypothalamic neurons [15] and functionally limited to
induced mutations in the Src-homology 2 (SH2) and
pleckstrin homology (PH) domains [16]. Selective rescue
in neurons eliminated both the obesity and the insulin
resistance phenotype [16]. Additional evidence for an
involvement of Sh2b1 in the regulation of energy
homeostasis is derived from expression analyses in mice
and rats. In DIO (diet-induced obese) rats fed a high-fat
diet, the expression of Sh2b1 in the hypothalamus was
decreased [17], while in mice on a high-fat and high-carbohydrate
diet, Sh2b1 expression increased in the
same tissue [18].

We initially detected transmission disequilibrium for a
SNP in the vicinity of SH2B1 (rs2008514) in 705 obesity
trios (TDT p = 0.0094). The SNP is a proxy of
rs7359397, which was identified in a large-scale GWAS [1]. We
screened the coding region of SH2B1 for (infrequent)
mutations in 95 extremely obese children and adolescents.
For melatonin receptor 1B (MTNR1B), a GWAS-derived gene
for type 2 diabetes mellitus, it was recently
shown that a number of rare to infrequent mutations
can be detected in the respective patients [19]. In
addition, based on the evidence for involvement of
SH2B1 in energy homeostasis, rare coding variants in
the gene could potentially result in monogenic obesity.
Subsequently, we assessed association of the identified
variants with obesity in independent study groups. Finally,
we analyzed in vitro the impact of the detected variants
on leptin receptor signalling.
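
The transmission disequilibrium test mentioned above can be sketched as follows. The source does not report the transmitted and untransmitted allele counts, so the counts used here are hypothetical, and the function name `tdt` is an assumption of this sketch rather than the PLINK implementation used in the study.

```python
import math


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """McNemar-type TDT statistic and asymptotic two-sided p-value (1 df).

    Counts refer to heterozygous parents only: how often the candidate
    risk allele was transmitted versus not transmitted to the affected child.
    """
    b, c = transmitted, untransmitted
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts, for illustration only.
chi2, p = tdt(transmitted=380, untransmitted=310)
print(f"TDT chi2 = {chi2:.2f}, p = {p:.4f}")
```

Because only heterozygous parents contribute, the test is robust to population stratification, which is one reason a trio design complements the case-control comparisons.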

Material & methods 

Study groups 

An overview of the ten study groups is given in Table 1;
details have been described previously [20,21]. The selection
of individuals for the mutation screen was based on
genotypes at SNP rs2008514 (proxy of rs7359397) in the
vicinity of SH2B1. In total, we analyzed 95 individuals,
90 of whom were likely enriched for the presence of
mutations in SH2B1. The other five individuals are
heterozygous carriers of a deletion at chr16p11.2 which
does not harbor SH2B1 [10]. These extremely obese
patients (offspring) from the family-based GWAS
sample were homozygous for the risk allele T of
rs2008514 and had at least one heterozygous parent,
thus substantially contributing to the observed over-transmission
of the rs2008514 T-allele. Association
of detected variants with obesity was analyzed in three
steps (Figure 1):

i. Association testing: All detected variants were
genotyped in a sample of 179 extremely obese
(age- and sex-adjusted BMI percentile > 99th; [22])
children or adolescents and 185 lean adult
(age- and sex-adjusted BMI percentile < 15th; [22])
controls. Basic phenotypic characteristics are given
in Table 1. Individuals were independent of the
mutation screening sample and were part of our
case-control GWAS sample (Genome-Wide Human
SNP Array 6.0, see [20,21]).
Table 1 Phenotypic description of analyzed study groups

Sample (a)                          Status    n       Male [%]  Age [mean ± SD]  BMI [mean ± SD]  BMI SDS (b) [mean ± SD]
Mutation Screen                     Cases     95      48.42     13.43 ± 3.37     31.87 ± 5.04     4.10 ± 1.71
S_I: family-based GWAS              Cases     705     45.11     13.44 ± 3.02     32.02 ± 5.82     4.23 ± 1.96
                                    Parents   1,410   50.00     42.54 ± 6.02     30.28 ± 6.33     1.65 ± 1.84
S_II: case-control GWAS             Cases     453     42.60     14.37 ± 3.75     33.15 ± 6.68     4.51 ± 2.15
                                    Controls  435     39.08     26.08 ± 5.75     18.09 ± 1.14     -1.45 ± 0.34
S_III: subset of case-control GWAS  Cases     179     49.16     14.27 ± 2.39     35.56 ± 6.13     5.3 ± 2.09
  (for association testing)         Controls  185     55.68     25.56 ± 3.94     18.39 ± 1.09     -1.47 ± 0.33
S_IV: obese adults                  Cases     988     37.25     47.17 ± 14.23    35.70 ± 5.43     3.22 ± 1.66
S_V: DAPOC                          Cases     1,185   44.22     10.72 ± 2.78     27.69 ± 5.11     3.07 ± 1.56
S_VI: KORA                          Cases     6,633   56.43     56.62 ± 13.13    29.42 ± 3.70     1.19 ± 1.09
                                    Controls  3,444   36.76     47.18 ± 13.37    22.72 ± 1.68     -0.53 ± 0.51
S_VII: Ulm children's study 1       Cases     97      57.73     7.57 ± 0.42      20.68 ± 1.71     1.06 ± 0.46
                                    Controls  685     54.31     7.56 ± 0.42      15.63 ± 1.38     -0.25 ± 0.37
S_VIII: BEPOC                       Cases     1,046   47.99     10.89 ± 3.57     29.46 ± 5.91     3.54 ± 1.82
S_IX: Ulm children's study 2        Cases     271     51.29     11.07 ± 3.69     29.75 ± 6.04     3.62 ± 1.88
S_X: Ulm children's study 3         Cases     129     43.41     14.90 ± 1.84     40.45 ± 8.00     7.05 ± 2.92

a Mutation Screen sample: part of the family-based and the case-control GWAS samples' cases; Family-based GWAS sample: 705 index patients with early-onset
extreme obesity and their biological parents; case-control GWAS sample: GWAS of early-onset extremely obese children and adolescents in comparison to lean,
adult controls; case-control sample for association testing: early-onset extremely obese children and adolescents in comparison to lean, adult controls,
independent of initial screening sample, subset of case-control GWAS sample; DAPOC: Datteln Paediatric Obese Cohort (27); KORA: Cooperative Health Research
in the Region of Augsburg (30); Ulm children's study 1: Ulm Research on Metabolism, Exercise and Lifestyle in Children (31); BEPOC: Berlin Paediatric Obese Cohort
(28); Ulm children's study 2 and 3: Ulm Paediatric Obese Cohort A and B (INSULA) (29).

b Calculation of the BMI SDS values has been based on population reference values following the National Nutrition Survey I (25).



ii. Further exploration of non-synonymous variants:
The three non-synonymous variants (rs147094247:
g.2749C/A, Thr175Asp; rs7498665: g.8164A/G,
Thr484Ala; g.9483C/T, βThr656Ile/γPro674Ser)
were additionally genotyped in the remaining
individuals of the family-based sample and the
cases and controls of our GWAS sample who
were not screened for mutations (see Mutation
screen section), in 988 obese adults [23], and
in 1,185 independent obese children or
adolescents of the 'Datteln Paediatric Obese
Cohort' (DAPOC [24]; Table 1).

iii. Additional exploration of rare non-synonymous
variants: The two coding mutations without
previously shown association with obesity (βThr656Ile/
γPro674Ser and rs147094247, Thr175Asp) were
additionally genotyped in three independent study
groups comprising obese children and adolescents
('Berlin Paediatric Obese Cohort', BEPOC;
n = 1,046 [25]; Ulm Children's Study 2 and 3;
n = 271 and 129, respectively [26]; Table 1) and in
two independent population-based cohorts, the
'Cooperative Health Research in the Region of
Augsburg' (KORA; n = 10,077 [27]) cohort of adults,
and the 'Ulm Children's Study 1' (n = 782 [28]) study
group of children and adolescents (Table 1).



In all samples, body mass index (BMI in kg/m²) was
assessed, and age- and sex-specific percentile criteria with
regard to the German population at the time of sample
recruitment (S_I, S_II, S_III, S_IV, S_VI [22]) were used to define
overweight or obese cases. Samples were divided into cases
(adults: BMI > 25 kg/m²; children and adolescents: BMI > 90th
percentile (www.mybmi.de)) and controls (adults: BMI <
25 kg/m²; children and adolescents: BMI < 90th percentile
(www.mybmi.de)). Written informed consent was given
by all participants and, in the case of minors, by their parents.
These studies were approved by the Ethics Committees
of the respective Universities and Institutions
and were performed in accordance with the Declaration
of Helsinki.

Mutation screen 

The coding region of SH2B1, located at chr16: 28,858,010
- 28,885,533 (hg18/NCBI 36), was screened for mutations
by denaturing high-performance liquid chromatography
(dHPLC, WAVE, Transgenomics) as described previously
[29]. The accuracy of dHPLC is similar to or even higher than
that of sequencing [30]. To enhance the sensitivity of detection of
homozygous mutation carriers, DNA of an individual with
wild type genotype (re-sequenced) was added to each
sample prior to PCR amplification. Detailed
information regarding the temperatures and primers
used for PCR and dHPLC analysis can be obtained
from the authors. All PCR amplicons with dHPLC
patterns deviant from the wild-type pattern were
re-sequenced as described previously [31]. At least two
experienced individuals independently assigned the
genotypes; discrepancies were resolved either by reaching
consensus or by re-genotyping.

[Figure 1: flow chart. Step 1: candidate gene selection and enrichment based on GWAS (best SNP in the
SH2B1 region, rs2008514). Step 2: mutation screen via dHPLC in 95 extremely obese index patients.
Step 3: association testing and confirmation in (i) 179 obese cases and 185 lean adult controls,
(ii) the family-based GWAS (705 trios), the case-control GWAS (453 cases, 435 controls), 1,185 DAPOC
children and 988 obese adults, and (iii) KORA (10,077 individuals), Ulm children's study 1 (782),
BEPOC (1,046) and Ulm children's study 2 and 3 (400).]

Figure 1 Flow chart for experimental setup. GWAS:
genome-wide association study; SNP: single nucleotide
polymorphism; dHPLC: denaturing high-performance liquid
chromatography; DAPOC: Datteln Paediatric Obese Cohort
(Reinehr et al. 2007); KORA: Cooperative Health Research in the
Region of Augsburg (Wichmann et al. 2005); Ulm children's
study 1: Ulm Research on Metabolism, Exercise and Lifestyle in
Children (Nagel et al. 2009); BEPOC: Berlin Paediatric Obese
Cohort (Bau et al. 2009); Ulm children's study 2 and 3: Ulm
Paediatric Obese Cohort A and B (INSULA) (Wabitsch et al. 2004).



Genotyping 

The identified variants in SH2B1 were genotyped in larger
study groups using either restriction fragment length
polymorphism (RFLP) or TaqMan assays (detailed
information can be obtained from the authors). For
rs147094247 (Thr175Asp) and the new coding mutation in
fragment 9 (g.9483C/T; βThr656Ile/γPro674Ser), custom
TaqMan assays were designed (SH2B1_2I_MUT, Assay ID:
AHCS0BY, and Frag9_mutl, Assay ID: AHMSHDX,
respectively, both Applied Biosystems). At least two
experienced individuals independently assigned the genotypes;
discrepancies were resolved either by reaching consensus or
by re-genotyping.

Statistics 

Allele and genotype distributions of all detected variants
did not deviate from Hardy-Weinberg equilibrium. To
analyze the obesity association of all variants, Fisher's
exact test (allelic association) was calculated with PLINK
[32]. Population-based samples were divided into cases
(BMI > 90th percentile) and controls (BMI < 90th percentile).
For rs7498665 an asymptotic, two-tailed p-value for
the transmission disequilibrium test (TDT) was
additionally calculated with PLINK. If not stated otherwise,
all p-values are asymptotic, two-sided and not corrected
for multiple testing.
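
As a rough illustration of the allelic association test, the following sketch computes a two-sided Fisher's exact p-value from a 2x2 table of allele counts using only the Python standard library. It is not the PLINK implementation used in the study; the function names are assumptions of this sketch, and the allele counts are derived here from the genotype counts listed in Table 2.

```python
import math


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: cases / controls; columns: minor / major allele counts.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    log_denom = log_comb(n, col1)

    def log_prob(x: int) -> float:
        # Hypergeometric probability of x minor alleles among the case alleles.
        return log_comb(row1, x) + log_comb(n - row1, col1 - x) - log_denom

    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    log_obs = log_prob(a)
    total = 0.0
    for x in range(lo, hi + 1):
        lp = log_prob(x)
        # Sum all tables that are at most as probable as the observed one.
        if lp <= log_obs + 1e-7:
            total += math.exp(lp)
    return min(1.0, total)


# g.9483C/T: 2 heterozygous carriers among 11,208 cases, none among 4,506 controls.
p_rare = fisher_exact_two_sided(2, 2 * 11206 + 2, 0, 2 * 4506)
# rs7498665: allele counts derived from the genotype counts in Table 2.
p_common = fisher_exact_two_sided(2550, 3728, 311, 557)
print(f"g.9483C/T: p = {p_rare:.3f}; rs7498665: p = {p_common:.4f}")
```

For the rare variant every possible table is at most as probable as the observed one, so the two-sided p-value is 1 even though all carriers are cases.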

Functional in silico analyses

To determine the potential alteration in gene expression,
all mutations were analyzed for loss or gain of cryptic
splice sites, transcription factor binding sites and gain or
loss of O-glycosylation sites. Prediction of the possible impact
of the amino acid exchanges on the structure and function of
SH2B1 was done with PolyPhen-2 [33], SNAP [34], PMUT
[35], and MutationTaster [36]. A detailed description of the
tools used can be found in Additional file 1: Supplementary
materials. Conservation was analyzed by aligning
sequences of 21 species in total (21 α, eight β and
six γ sequences; comprehensive list in Additional file 1:
Supplementary materials).

Functional in vitro analyses: STAT3-mediated leptin
receptor signalling

The effect of SH2B1 harboring the infrequent alleles of
Thr484Ala and βThr656Ile/γPro674Ser on leptin receptor
activity was determined with a quantitative reporter
gene assay (adapted from [37]). HEK293 cells were transiently
transfected with the murine long form of the leptin
receptor (Lepr-b) in pcDNA3.1, a signal transducer
and activator of transcription 3 (STAT3) responsive Photinus
luciferase construct (pAH32), a constitutive Renilla
luciferase expression vector for data normalization
(phRG-b, Promega) and human SH2B1 splice variants
beta and gamma with and without the mutations in
pCMV-XL5 expression vectors (Lepr-b and pAH32 were
kindly provided by Rosenblum et al. [37]). As a control,
empty pcDNA3 expression vector was transfected instead
of SH2B1. After stimulation with mouse leptin
(concentrations 0, 0.5, 1, 5, 10, 50, 100, 500 ng/ml),
luciferase expression driven by the STAT3 reporter construct
was measured with the Dual-Luciferase Reporter Assay system
according to the manufacturer's instructions (Promega).
Dose-response curves, EC50 and Emax values were
calculated with GraphPad Prism.
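
The quantities EC50 and Emax can be made concrete with a minimal sketch of a sigmoidal dose-response model. The four-parameter logistic form and all parameter values below are assumptions for illustration; the source does not state which model was fitted in GraphPad Prism, and the helper names are hypothetical.

```python
def normalized_signal(firefly: float, renilla: float) -> float:
    """Reporter (Photinus) luciferase counts divided by Renilla counts."""
    return firefly / renilla


def hill(conc: float, bottom: float, emax: float, ec50: float, slope: float = 1.0) -> float:
    """Four-parameter logistic dose-response curve.

    bottom: response without ligand; emax: plateau at saturating ligand;
    ec50: concentration giving the half-maximal increase over bottom.
    """
    if conc <= 0:
        return bottom
    return bottom + (emax - bottom) / (1.0 + (ec50 / conc) ** slope)


LEPTIN_NG_ML = [0, 0.5, 1, 5, 10, 50, 100, 500]  # concentrations used in the assay

# Hypothetical curve parameters, for illustration only.
curve = [hill(c, bottom=1.0, emax=8.0, ec50=5.0) for c in LEPTIN_NG_ML]
for conc, response in zip(LEPTIN_NG_ML, curve):
    print(f"{conc:>6} ng/ml -> {response:.2f}")
```

Under this model a variant with impaired signalling would show a lower Emax, a higher EC50, or both, relative to the wild type.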

Results and discussion 

We performed a mutation screen of the coding region of
SH2B1 in 95 extremely obese children and adolescents.
We identified two previously unknown mutations and five known
SNPs in SH2B1 (Table 2). All detected variants were followed
up in a small, independent case-control sample;
only non-synonymous variants were additionally genotyped
in further independent study groups.

New infrequent variant βThr656Ile/γPro674Ser

A new mutation at position g.9483 (C/T) of SH2B1 results
in a non-synonymous, non-conservative exchange in two
of the three human splice variants (β and γ) of SH2B1.
Due to a shifted reading frame for the two splice variants,
the mutation results in two different non-synonymous,
non-conservative exchanges (βThr656Ile or γPro674Ser)
in the β or γ splice variants, respectively. The βThr656Ile/
γPro674Ser mutation was not detected in an independent
sample of 179 extremely obese cases and 185 lean controls.
As the mutation resulted in a non-synonymous
amino acid exchange which was predicted to change protein
structure (Table 3), we additionally genotyped a total
of 11,029 (extremely) obese or overweight children,
adolescents and adults and 4,321 controls (for children
and adolescents BMI < 90th percentile, for adults BMI <
25 kg/m²) for this mutation. We detected two additional
obese cases with this mutation and no mutation carrier
among the controls. The extremely low frequency of the
mutation limits the determination of an association with
overweight and obesity (p = 1; Table 2). We calculated that
the control group would need to include more than
545,757 individuals to reveal a p-value below 0.05 with
statistical power above 80%, if the observed trend (only
mutation carriers among the overweight or obese individuals)
remained stable. Both risk alleles (T-allele at
g.9483C/T and G-allele at rs7498665) are potentially
located on the same haplotype (as determined in one
index patient and his mother who transmitted the haplotype;
for the other carriers, full genotype information of
both parents was not available). A founder effect of this
mutation is likely.

All three detected βThr656Ile/γPro674Ser mutation
carriers are female. The initially identified mutation carrier
(a) of the screening sample (height 163 cm, weight
86.2 kg, BMI 32.44 kg/m², age 12.7 years) as well as one
mutation carrier (b) from the follow-up samples (height
142 cm, weight 53.2 kg, BMI 26.38 kg/m², age 9.9 years)
had a BMI > 99th percentile. In both cases, the overweight
or obese mother (BMI 25.76 kg/m² and 32.61 kg/m²,
respectively) transmitted the mutation to the extremely
obese child. The third mutation carrier (c) had a
BMI > 90th percentile (height 130 cm, weight 31.5 kg,
BMI 18.64 kg/m², age 7.2 years). For this mutation carrier,
genotypic information about the parents was not
available (mother BMI 19.81 kg/m², father BMI 26.7 kg/m²).
Insulin levels were only available for one mutation
carrier (b), whose level was in the normal range (9.4
mU/l). Additional family members were not available.
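
The reported BMI values follow directly from the heights and weights given above, as the following small check shows. The function name `bmi` and the carrier labels are taken over or chosen for this sketch only.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


# (weight in kg, height in cm) of the three mutation carriers as reported in the text.
carriers = {"a": (86.2, 163), "b": (53.2, 142), "c": (31.5, 130)}
for label, (weight, height) in carriers.items():
    print(label, round(bmi(weight, height), 2))
```

Note that the same absolute BMI corresponds to very different percentiles depending on age and sex, which is why carrier (c), with a BMI of 18.64 kg/m² at 7.2 years, still lies above the 90th percentile.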

The amino acid exchanges in the β and γ splice variants
(βThr656Ile/γPro674Ser) are located outside the domain
structure (self-dimerization, pleckstrin homology and SH2
domain; [34]), which is relevant for the function of
SH2B1. We analyzed the impact of the rare allele
of βThr656Ile/γPro674Ser on leptin signalling in vitro via
the STAT3 pathway. In both splice variants, β656Ile or
γ674Ser showed unaltered leptin signalling (Figure 2,
Additional file 1: Table S2).

Table 2 Minor allele frequencies of the detected SNPs and mutations in SH2B1 excluding the screening group

Position    rs-Number    Amino acid exchange      Genotypes cases        MAF cases  Genotypes controls  MAF controls  Odds       95% CI        Nominal
                                                  11      12      22     [%]        11     12    22     [%]           ratio (d)                p-value
g.2749C/A   rs147094247  Thr175Asp                11,257  11      0      0.05       4,511  1     0      0.01          4.4        0.57 - 34.13  0.199 (c)
g.8164A/G   rs7498665    Thr484Ala                512     1,526   1,101  40.62      58     195   181    35.83         1.2        1.06 - 1.42   0.007 (a)
g.8250C/T   rs60604881                            70      87      22     36.59      73     75    37     40.27         0.86       0.63 - 1.15   0.323 (b)
g.8738A/G   rs62037368                            176     2       0      0.56       182    3     0      0.81          0.69       0.11 - 4.16   1 (b)
g.8764C/T   rs62037369                            65      81      32     40.73      79     82    23     34.78         1.29       0.95 - 1.74   0.107 (b)
g.9483C/T                βThr656Ile, γPro674Ser   11,206  2       0      0.01       4,506  0     0      0.00          NA         NA            1 (c)
g.10182C/A                                        178     1       0      0.28       184    0     0      0.00          NA         NA            0.493 (b)

MAF: minor allele frequency; CI: confidence interval; NA: not applicable.
a genotyped in 3,230 obese cases and 439 lean controls.
b genotyped in a total of 179 obese cases and 185 lean controls.
c genotyped in a total of 11,406 obese and overweight cases and 4,568 (mainly population based) controls.
d Odds ratio is given with respect to the minor allele.

Table 3 In silico prediction of splice sites, transcription factor binding sites and O-glycosylation sites of detected
variants in SH2B1

SNPs         Amino acid changes       DNA position  ESEfinder       ESRSearch             RESCUE_ESE      TFSearch                  Consite               OGPET
                                                    (splice sites)  (splice sites)        (splice sites)  (transcription factor     (transcription factor (O-glycosylation site
                                                                                                          binding sites)            binding sites)        prediction [%])
rs147094247  Thr175Asp                g.2749C/A     changed         changed               not changed     not changed               not changed           33.1829
rs7498665    Thr484Ala                g.8164A/G     changed         changed               not changed     not changed               not changed
rs60604881                            g.8250C/T     changed         SRp40, FOX1-FOX2 (a)  not changed     not changed               not changed
rs62037368                            g.8738A/G     changed         not changed           not changed     not changed               COUP-TF, c-REL (b)
rs62037369                            g.8764C/T     changed         changed               not changed     GATA1, GATA2, MZF1 (b)    not changed
g.9483C/T    βThr656Ile, γPro674Ser   g.9483C/T     changed         not changed           not changed     GATA1, GATA2 (b)          not changed           53.5329
g.10182C/A                            g.10182C/A    changed         PESS (a)              not changed     not changed               not changed

a particular changed predicted splice sites; b particular changed predicted transcription factor binding sites.

While the γPro674Ser exchange in the γ splice variant is
predicted to be neutral by in silico programs, the exchange
βThr656Ile in the β splice variant was predicted to be
"not neutral" (SNAP), "pathological" (PMUT) or "disease
causing" (MutationTaster) in three of four programs
(Additional file 1: Table S1). The exchange in the β splice
variant would also destroy a predicted O-glycosylation site
of the SH2B1 protein. Amino acid conservation was
strong at both positions (86% for βThr656Ile and 100%
for γPro674Ser, Additional file 1: Table S1).

Previous in vitro data showed functional differences
between β and γ SH2B1 variants: (a) It has been shown
that the β splice variant is mainly expressed in the hypothalamus
[16], a brain region known to be implicated in
weight regulation. A Sh2b1β rescue was sufficient to
prevent the Sh2b1 knockout phenotype in mice [19].
Leptin signalling is mediated by the interaction of
SH2B1β and JAK2 [17]. The β splice variant of SH2B1
recruits insulin receptor substrates 1 and 2 (IRS1 and 2)
to the LEPRb/JAK2 complex [38]. SH2B1β enhances JAK2
activity and promotes the activation of several downstream
networks like STAT3 and phosphatidylinositol (PI)
3-kinase pathways [18,39]. Two in silico analyses predicted
an altered folding or function of SH2B1 βThr656Ile
(Additional file 1: Table S1). (b) The γ splice variant of SH2B1
is peripherally expressed. It interacts with Tyr1158 in the
activation loop of the insulin receptor and prohibits
dephosphorylation of IRS1 and IRS2 [40]. This interaction
enhances insulin signalling and insulin receptor
autophosphorylation, leading in turn to activation of downstream
pathways [41]. While the three tyrosine motifs in
the N-terminal part of SH2B1, which regulate interaction
with the insulin receptor [42], are not directly affected by
this exchange, it is possible that altered protein folding
due to a non-conservative amino acid exchange in a
highly conserved position prevents their phosphorylation.

[Figure 2: two dose-response panels, SH2B1 beta (left) and gamma (right); x-axis: Leptin [ng/ml] (0, 0.5, 5, 50, 500).]

Figure 2 Leptin receptor activity measured by STAT3 mediated luciferase response. HEK293 cells (n = 8 separate experiments) were
co-transfected with LEPRb, a STAT3 responsive element and SH2B1 splice variants beta (left) and gamma (right) with and without the infrequent
alleles at rs7498665 (Thr484Ala) and βThr656Ile/γPro674Ser. The dose response curves depict leptin receptor activity after stimulation with leptin
(for exact values of each data point see Additional file 1: Table S2).

Coding GWAS derived SNP rs7498665 

We confirmed that the described risk allele G at SNP
rs7498665 [2,3] is associated with obesity in our 705
obesity trios (p = 0.009) and in a total of 3,139 independent
cases and 434 controls (p = 0.007, odds ratio (OR) =
1.22, 95% confidence interval (CI) 1.06-1.42; Table 2).
This coding SNP results in the non-synonymous,
non-conservative amino acid exchange Thr484Ala in a splice
variant independent position with low conservation (5%,
Additional file 1: Table S1). As the association with
obesity was previously described for this SNP, we did not
analyze further study groups [2-6]. In vitro analyses
revealed unaltered leptin signalling via STAT3 in both
splice variants for the obesity risk allele at Thr484Ala
(Figure 2). Both Emax and EC50 were non-significantly
reduced (Figure 2, Additional file 1: Table S2).
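
The allele frequencies, odds ratio and confidence interval reported for rs7498665 can be approximately reproduced from the genotype counts in Table 2. The sketch below assumes that the first genotype column holds the minor-allele homozygotes for this SNP (inferred from the reported frequencies) and uses a Woolf-type (log odds ratio) interval, which may differ from the exact procedure used by the authors; all function names are hypothetical.

```python
import math


def allele_counts(hom_minor: int, het: int, hom_major: int) -> tuple[int, int]:
    """Return (minor, major) allele counts from genotype counts."""
    return 2 * hom_minor + het, 2 * hom_major + het


def odds_ratio_ci(cases: tuple[int, int], controls: tuple[int, int], z: float = 1.96):
    """Allelic odds ratio for the minor allele with a Woolf-type 95% CI."""
    a, b = cases
    c, d = controls
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


cases = allele_counts(512, 1526, 1101)    # rs7498665 genotypes, cases (Table 2)
controls = allele_counts(58, 195, 181)    # rs7498665 genotypes, controls (Table 2)
maf_cases = cases[0] / sum(cases)
maf_controls = controls[0] / sum(controls)
odds_ratio, lower, upper = odds_ratio_ci(cases, controls)
print(f"MAF cases {maf_cases:.2%}, controls {maf_controls:.2%}")
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Under these assumptions the sketch gives values close to those in Table 2, which supports reading the reported odds ratio as an allelic (not genotypic) effect estimate.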

Since the obesity risk allele at the GWAS-derived
polymorphism rs7498665 increases BMI by only approximately
0.15 BMI units (kg/m²) as calculated in a
population of 125,931 European individuals [1], we
expected only subtle functional alterations associated
with the minor allele of this variant.

Other genetic variants in SH2B1 

The second newly detected mutation is located in the 3'
UTR at base pair position g.10182 (C/A). This non-coding
variant was detected twice within the screening
sample, and once in an obese case in the association
testing step (Figure 1). The variant showed no association
with obesity in a small case-control comparison (p =
0.49; Table 2); in silico analyses predicted a possible
change in splice sites for this variant (Table 3).

Results for the four other identified SNPs were as follows:
The third coding SNP rs147094247 leads to a
non-synonymous, conservative exchange (Thr175Asp) at a
conserved position (71%, Additional file 1: Table S1). No
association with obesity (p = 0.199, odds ratio (OR) = 4.4,
95% confidence interval (CI) 0.57 - 34.13; Table 2) was
found for this SNP in a sample of 11,268 obese and
overweight cases and 4,512 lean or normal weight controls
(mostly population based). For the non-synonymous SNP
rs147094247 (Thr175Asp), in silico analyses predict a neutral
outcome for the altered amino acid (Additional file 1:
Table S1).

For SNPs rs60604881 (p = 0.323, OR = 1.17, 95% CI =
0.87-1.58), rs62037368 (p = 1.00, OR = 1.45, 95% CI =
0.24-8.71) and rs62037369 (p = 0.107, OR = 1.29, 95%
CI = 0.95-1.74) evidence for association with early-onset
extreme obesity could not be detected (Table 2). For all
SNPs, in silico analyses predict splice site or transcription
factor binding site changes (Table 3).

Leptin signalling 

The assay that measured STAT3 mediated leptin re- 
sponse successfully showed increased leptin response 
after co-transfection with wild type SH2B1 splice var- 
iants ji and y. This indicates that HEK293 cells and the 
STAT3 assay allow functional characterization of SH2B1. 
While in mice only the alpha splice variant was tested 
for leptin signalling [43], we observe an effect on leptin 
signalling for both other splice variants (/? and y) in our 
human cell system (Figure 2, Additional file 1: Table S2). 
The analysis of the impact of both SH2B1 variants on 
leptin receptor activity showed no significant reduction 
of STAT3 mediated signalling by the risk alleles at 
rs7498665 and y SThr656Ile/)T , ro674Ser. The non- 
significant decrease in EC50 and Emax for both tested 
variants in splice variants ji and y could indicate both 
gain of function and reduced function; the biological im- 
pact of both remains to be solved. If indeed a minor 
functional effect would be present, a much larger num- 
ber of replicates would be necessary to establish a sig- 
nificant effect (e.g. about 2x270 replicates when using 
the Emax point estimates of SH2Bly vs. SH2BlyP674S 
and their variances when applying Satterthwaite/Welch 
t-Test aiming at 80% power for two-sided a = 5%). Our 
results could, of course, indicate that the two variants
are not functionally relevant. However, for a polygenic
variant a large functional effect is rather unlikely. For example,
the melanocortin 4 receptor gene (MC4R), a well-known
obesity gene, harbors two polymorphisms
(Val103Ile and Ile251Leu) that are negatively associated
with obesity [44,45]. Carriers of the minor alleles have a
BMI approx. 0.5 BMI units lower than wild type carriers
[44,45]. Initial in vitro assays did not show functional
implications for the minor alleles of these SNPs (e.g.
[44]), but when the number of different assays was
increased, in vitro tests showed a potential small gain of
function for both minor alleles [46], which could explain
the weight-lowering effect of the variants. Hence we
speculate that the functional effect of the analyzed
SH2B1 variants might become detectable when a battery
of different functional tests is applied. Currently we have
first hints that both variants are compatible with a
slightly reduced function. In addition, with STAT3 
mediated leptin signalling, we only tested one of the 
many potential interaction partners of SH2B1 in regula- 
tion of energy homeostasis. A potential additive effect of 
small functional changes in leptinergic and insulinergic 
signalling could result in stronger impact on body 
weight maintenance. 
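
The replicate estimate mentioned above (80% power, two-sided α = 5%, unequal variances) follows the usual sample-size logic for comparing two means. The sketch below uses a normal approximation rather than the exact Satterthwaite/Welch t-based calculation named in the text, so it will slightly underestimate the required number. The function name and the input values are hypothetical and are not the Emax estimates from this study.

```python
import math
from statistics import NormalDist


def replicates_per_group(delta, sd_a, sd_b, alpha=0.05, power=0.80):
    """Approximate replicates per group for a two-sided comparison of
    two means with unequal variances and equal group sizes.

    Uses the normal approximation
        n = (z_{1-alpha/2} + z_{power})^2 * (sd_a^2 + sd_b^2) / delta^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = (z_alpha + z_power) ** 2 * (sd_a ** 2 + sd_b ** 2) / delta ** 2
    return math.ceil(n)


# Hypothetical inputs: a small difference relative to the assay noise.
n = replicates_per_group(delta=0.5, sd_a=1.0, sd_b=1.0)
print(f"about 2 x {n} replicates")
```

Because the required number scales with 1/delta², halving the expected difference roughly quadruples the replicates needed, which is why small polygenic effects are hard to demonstrate in a single in vitro assay.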

Conclusion 

A recent mutation screen in 300 children from the
GOOS cohort who display insulin resistance in
addition to obesity revealed three variants and one SNP
that showed an effect on cell differentiation and migration
but, with the exception of the frameshift variant
Phe344fs, no other functional deficiencies [47]. Comparable
to our study, Doche et al. analyzed the impact of
detected variants on Janus kinase 2 (JAK2) phosphorylation,
with additional tests of insulin receptor substrate 2
(IRS2) phosphorylation and SH2B1 dimerization [47].

Given the low frequency of βThr656Ile/γPro674Ser
(g.9483C/T), this mutation cannot explain our positive
TDT for rs2008514 with obesity. Adding the three rare
mutations detected by Doche et al., which show low functional
impact, still leaves a large proportion of the BMI association
unexplained [47]. The region around SH2B1 on chr
16p11.2 shows low recombination rates for approximately
1 Mb (chr16:28,177,800 shows a recombination peak of
37 cM/Mb and chr16:28,944,400 a recombination peak of
36 cM/Mb; HapMap, http://hapmap.ncbi.nlm.nih.gov/),
implying a large region with high linkage disequilibrium.
The area tagged by both BMI associated SNPs
(rs7498665 and rs7359397 [1-3]) covers 17 genes (compare
Additional file 1: Figure S1). Hence, relevant mutations
in one of the remaining 16 genes might account for
a larger proportion of the GWAS results. Alternatively,
genetic variation outside of the SH2B1 coding region with
a regulatory effect on this gene might explain the association in
functional terms. Guo et al. recently showed that an intronic
SNP in SH2B1 (rs4788099) regulated mRNA
expression of the nearby genes Tu translation elongation factor,
mitochondrial (TUFM), coiled-coil domain containing
101 (CCDC101), Homo sapiens spinster homolog 1
(SPNS1), sulfotransferase family, cytosolic, 1A, phenol-preferring
member 1 (SULT1A1) and sulfotransferase family,
cytosolic, 1A, phenol-preferring member 4 (SULT1A4) in B
cells and monocytes [48]. This is in concordance with
findings by Gutierrez-Aguilar et al., who reported differential
regulation of Sh2b1, Tufm and Sult1a1 in rats fed a
high fat diet [17].
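
The extent of the low-recombination region can be checked directly from the two peak positions quoted above. This short calculation uses only those two coordinates; the variable names are illustrative.

```python
# Positions of the two recombination peaks quoted in the text (chr16, bp).
peak_start = 28_177_800
peak_end = 28_944_400

# Distance between the peaks, in base pairs and in megabases.
span_bp = peak_end - peak_start
span_mb = span_bp / 1_000_000
print(f"{span_bp:,} bp = {span_mb:.2f} Mb")
```

The span between the peaks is about 0.77 Mb, which the text rounds to approximately 1 Mb.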

In conclusion, the rare allele of the variant βThr656Ile/
γPro674Ser in SH2B1 was found exclusively in three overweight
or obese children but not in normal-weight or
underweight controls. Our findings suggest that this new
rare mutation predisposes to increased BMI, possibly
related to decreased leptin signalling. Further studies are
warranted to investigate the functional impact of the mutation
for both affected splice variants on the interaction
of SH2B1 effector systems (e.g. leptin and insulin receptors),
which play a major role in energy homeostasis.

Additional file 
Additional file 1: Table 1. In silico functional prediction of detected
non-synonymous mutations in SH2B1. Table 2: Parameters of leptin
receptor activity measured by STAT3 mediated luciferase response.
Figure 1: Identified variants in the three splice variants (α, β and γ) of
human SH2B1. SH2B1 mRNA - coding parts as filled blocks - (Ensembl
sequences α: ENST00000322610, β: ENST00000359285 and γ:
ENST00000337120). The domain structure (Quian and Ginty 2001) with
dimerization, Pleckstrin homology and SH2 domain is shown as
underlying grey boxes. Positions of detected variants are marked with
lines. Available rs-numbers, if applicable amino acid exchanges and minor
allele frequencies in obese cases (MAF according to Table 1) are given
for each variant. Supplementary material: In silico analysis tool description.
Supplementary Figure 2: Regional association and linkage disequilibrium
plot of 1000 genome project data centered to SNP rs7498665
(http://www.525broadinstitute.org/mpg/snap/). Displayed are recombination
rate (blue), r² to rs7498665 (range of grey, increased intensity shows
higher linkage) and genes in region. Dashed lines mark a region in high
LD (r² > 0.8) with rs7498665. Gene abbreviations: EIF3CL/EIF3C (eukaryotic
translation initiation factor 3), CLN3 (ceroid-lipofuscinosis, neuronal 3),
APOB48R (apolipoprotein B48 receptor), IL27 (interleukin 27), NUPR1 (p8
protein isoform a), CCDC101 (coiled-coil domain containing 101), SULT1A1
(sulfotransferase family, cytosolic, 1A, member 1), SULT1A2
(sulfotransferase family, cytosolic, 1A, member 2), ATXN2L (ataxin 2 related
protein isoform C), TUFM (Tu translation elongation factor, mitochondrial),
SH2B1 (SH2B adaptor protein 1 isoform 1), ATP2A1 (ATPase, Ca++
transporting, fast twitch 1 isoform), RABEP2 (rabaptin, RAB GTPase binding
effector protein 2), CD19 (CD19 antigen precursor), NFATC2IP (Nuclear
factor of activated T-cells, cytoplasmic 2-interacting protein), SPNS1
(spinster homolog 1 isoform 1), LAT (linker for activation of T cells isoform
b). Figure 3: Conservation in C-terminal sequences of SH2B1 splice
variants (β and γ). Boxes mark the position of exchange g.9483C/T
(βThr656Ile/γPro674Ser) in β and γ splice variants in several species.